So, as was mentioned in a couple of comments, keeping the data in an array is simpler, but that approach does not scale well as the data set grows, because the whole array has to be held in memory at once. You really only need a list when you want random access to items by index; otherwise, generators are the way to go. Below I have prototyped a reader function that reads each JSON object individually and returns a generator.
The basic idea is to have the reader split on the newline character "\n" (or "\r\n" on Windows). Python does this for you when you iterate over a file object, which yields one line at a time, just as repeated calls to file.readline() would.
import json

def json_reader(filename):
    # Yield one parsed object per line; only the current line is held in memory.
    with open(filename) as f:
        for line in f:
            yield json.loads(line)
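As a rough sketch of how the reader behaves in practice, the snippet below writes a small sample file and pulls objects from it one at a time. The sample data and the temporary file are made up for the demo, and the blank-line check is my own assumption about how you would want empty lines handled (without it, json.loads raises an error on them).

```python
import json
import os
import tempfile


def json_reader(filename):
    """Yield one parsed JSON object per non-empty line of the file."""
    with open(filename) as f:
        for line in f:
            line = line.strip()
            if line:  # assumption: skip blank lines instead of raising
                yield json.loads(line)


# Hypothetical sample data, written to a temporary file for the demo.
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False) as tmp:
    tmp.write('{"id": 1}\n\n{"id": 2}\n')
    sample_path = tmp.name

reader = json_reader(sample_path)
first = next(reader)   # only the first line has been read and parsed so far
rest = list(reader)    # consume whatever is left, which also closes the file
os.remove(sample_path)
```

Nothing is read until you ask for the next item, which is what keeps memory use flat no matter how large the file gets.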
However, this method only really works when the file is written as you have it -- with each object separated by a newline character. Below I wrote an example of a writer that separates an array of json objects and saves each one on a new line.
def json_writer(file, json_objects):
    with open(file, "w") as f:
        for jsonobj in json_objects:
            jsonstr = json.dumps(jsonobj)
            f.write(jsonstr + "\n")
You could also do the same operation with file.writelines() and a list comprehension:
...
json_strs = [json.dumps(j) + "\n" for j in json_objects]
f.writelines(json_strs)
...
And if you want to append the data instead of writing a new file, just change open(file, "w") to open(file, "a").
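To tie the writer variants together, here is one possible sketch that takes the open mode as an argument. The mode parameter, the temporary path, and the sample records are hypothetical additions of mine, not part of the original writer. It also passes a generator expression to writelines() rather than a list, so the strings are not all built up front.

```python
import json
import os
import tempfile


def json_writer(file, json_objects, mode="w"):
    """Write each object as one JSON document per line.

    `mode` is a hypothetical extra parameter: "w" overwrites, "a" appends.
    """
    with open(file, mode) as f:
        # writelines() accepts any iterable of strings, so a generator works.
        f.writelines(json.dumps(obj) + "\n" for obj in json_objects)


# Hypothetical demo path and records.
path = os.path.join(tempfile.mkdtemp(), "records.jsonl")
json_writer(path, [{"id": 1}, {"id": 2}])   # creates (or overwrites) the file
json_writer(path, [{"id": 3}], mode="a")    # appends one more line

with open(path) as f:
    lines = f.read().splitlines()
```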
In the end, I find this helps a great deal, not only with readability when I open JSON files in a text editor, but also with using memory more efficiently.
On that note, if you change your mind at some point and want a list out of the reader, you can pass the generator it returns to list(), which consumes it and populates the list automatically. In other words, just write
lst = list(json_reader(file))
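If you only need the first few objects rather than the whole list, itertools.islice can cut the generator short so the rest of the input is never parsed. This is just a sketch: the json_lines helper and the in-memory sample stream are stand-ins I made up so the example runs without a file on disk.

```python
import io
import json
from itertools import islice


def json_lines(stream):
    """Yield one parsed JSON object per non-empty line of an open text stream."""
    for line in stream:
        if line.strip():
            yield json.loads(line)


# Hypothetical in-memory data standing in for an open file.
stream = io.StringIO('{"n": 0}\n{"n": 1}\n{"n": 2}\n{"n": 3}\n')

first_two = list(islice(json_lines(stream), 2))       # parses only two lines
remaining_count = sum(1 for _ in json_lines(stream))  # the rest is still unread
```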